Load the necessary libraries
library(car) # for regression diagnostics
library(broom) # for tidy output
library(broom.mixed) # for tidy output
library(ggfortify) # for model diagnostics
library(sjPlot) # for outputs
library(knitr) # for kable
library(effects) # for partial effects plots
library(ggeffects) # for effects plots in ggplotjk
library(emmeans) # for estimating marginal means
library(MASS) # for glm.nb
library(MuMIn) # for AICc
library(tidyverse) # for data wrangling
library(DHARMa) # for assessing dispersion etc
library(glmmTMB) # for glmmTMB
library(performance) # for diagnostic plots
library(see) # for diagnostic plots
library(lme4) # for glmer
library(glmmTMB) # for glmmTMB
Some ornithologists were interested in the degree of sibling negotiations in owl chicks. Specifically, they wanted to explore how sibling negotiations were affected by feeding satiety and the sex of the parent returning to the nest. The ornithologists had accessed to a number of owl nests and were able to count (via recording equipment) the number of sibling negotiations (calls) that the owl chicks made when the parent returned to the nest.
We could hypothesise that the chicks might call more if they were hungry. As part of the investigation, the researchers were able to provided supplementary food. As such, they were able to manipulate the conditions such that sometimes the chicks in a nest would be considered deprived of supplementary food and at other times they were satiated.
As a parent returned, the researchers recorded the number of sibling negotiations (calls) along with the sex of the parent. Since the number of calls is likely to be a function of the number of chicks (the more chicks the more calls), the researchers also counted the number of siblings in the brood.
Each nest was measured on multiple occasions. Hence, we must include the nest as a random effect to account for the lack of independence between observations on the same set of siblings.
owls <- read_csv("../public/data/owls.csv", trim_ws = TRUE)
glimpse(owls)
## Rows: 599
## Columns: 7
## $ Nest <chr> "AutavauxTV", "AutavauxTV", "AutavauxTV", "Autavaux…
## $ FoodTreatment <chr> "Deprived", "Satiated", "Deprived", "Deprived", "De…
## $ SexParent <chr> "Male", "Male", "Male", "Male", "Male", "Male", "Ma…
## $ ArrivalTime <dbl> 22.25, 22.38, 22.53, 22.56, 22.61, 22.65, 22.76, 22…
## $ SiblingNegotiation <dbl> 4, 0, 2, 2, 2, 2, 18, 4, 18, 0, 0, 3, 0, 3, 3, 6, 0…
## $ BroodSize <dbl> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, …
## $ NegPerChick <dbl> 0.8, 0.0, 0.4, 0.4, 0.4, 0.4, 3.6, 0.8, 3.6, 0.0, 0…
Let start by declaring the categorical variables and random effect as factors.
## Amount of Sibling negotiation (vocalizations when parents are absent)
## Foot treatment (deprived or satiated
## Sex of parent
## Arrival time of parent
## Nest as random
## Brood size offset
owls <- owls %>% mutate(
Nest = factor(Nest),
FoodTreatment = factor(FoodTreatment),
SexParent = factor(SexParent),
NCalls = SiblingNegotiation
)
As the response represents counts (the number of calls), it would make sense to start by considering a Poisson model. We could attempt to model the response as the number of calls divided by the brood size, but this would result in a response that has no natural distribution.
Instead, if we include brood size as an offset, it will standardise the effects according to brood size (similar to having divided the response by brood size), yet retain the Poisson nature of the response.
The effects of offsets, unlike regular covariates, are not estimated. Rather they are assumed to be 1 (on the link scale). This means that since Poisson uses a log link, then the offset should be of a logged version of the brood size.
Model formula: \[ y_i \sim{} \mathcal{Pois}(\lambda_i)\\ ln(\lambda_i) =\boldsymbol{\beta} \bf{X_i} + \boldsymbol{\gamma} \bf{Z_i} \]
where \(\boldsymbol{\beta}\) and \(\boldsymbol{\gamma}\) are vectors of the fixed and random effects parameters respectively and \(\bf{X}\) is the model matrix representing the overall intercept and effects of food treatment, sex of parent, arrival time (and various interactions) on the number of sibling negotiations. Brood size was also incorporated as an offset. \(\bf{Z}\) represents a cell means model matrix for the random intercepts associated with individual nests.
Perhaps we could start off by exploring the main fixed effects. To mimic the log-link, we will use a log-transformed y axis. Since there may well be zeros (no calls detected), we will use a pseudo log scale). We will also include the raw data (jittered and dodged)
ggplot(data = owls, aes(y = NCalls, x = FoodTreatment, color = SexParent)) +
geom_violin() +
geom_point()
ggplot(data = owls, aes(y = NCalls, x = FoodTreatment, color = SexParent)) +
geom_violin() +
geom_point(position = position_jitterdodge(jitter.width = 0.2, dodge.width = 0.9))
ggplot(data = owls, aes(y = NCalls, x = FoodTreatment, color = SexParent)) +
geom_violin() +
geom_point(position = position_jitterdodge(jitter.height = 0, dodge.width = 1)) +
scale_y_continuous(trans = scales::pseudo_log_trans())
Now, a similar plot separated for each nest.
ggplot(data = owls) +
geom_point(aes(y = NCalls, x = FoodTreatment, color = SexParent), position = position_dodge(0.5)) +
facet_wrap(~Nest)
It might also be worth establishing that there is a linear relationship between the number of calls and brood size. Again, we will mimic the use of the log-link by transforming the axes.
ggplot(data = owls, aes(y = NCalls, x = BroodSize, color = SexParent)) +
geom_point() +
geom_smooth(method = "lm") +
facet_grid(~FoodTreatment) +
scale_y_continuous(trans = scales::pseudo_log_trans()) +
scale_x_log10()
owls.glmer1 <- glmer(NCalls ~ 1 + offset(log(BroodSize)) + (1 | Nest),
data = owls,
family = poisson(link = "log")
)
owls.glmer2 <- glmer(NCalls ~ 1 + offset(log(BroodSize)) + (FoodTreatment | Nest),
data = owls,
family = poisson(link = "log")
)
owls.glmer3 <- glmer(NCalls ~ 1 + offset(log(BroodSize)) + (SexParent | Nest),
data = owls,
family = poisson(link = "log")
)
owls.glmer4 <- glmer(NCalls ~ 1 + offset(log(BroodSize)) + (FoodTreatment * SexParent | Nest),
data = owls,
family = poisson(link = "log")
)
## owls.glmer1a <- owls.glmer1
## owls.glmer1b <- update(owls.glmer1a, ~ . - (1|Nest) + (FoodTreatment|Nest))
## owls.glmer1c <- update(owls.glmer1a, ~ . - (1|Nest) + (SexParent|Nest))
## owls.glmer1d <- update(owls.glmer1a, ~ . - (1|Nest) + (FoodTreatment*SexParent|Nest))
## owls.allFit <- allFit(owls.glmer1d)
## owls.allFit
## ## Check which of the models are considered valid (OK)
## is.OK <- sapply(owls.allFit, is, "merMod")
## is.OK
## diff_optims.OK <- owls.allFit[is.OK]
## lapply(diff_optims.OK,function(x) x@optinfo$conv$lme4$messages)
## owls.glmer1d <- update(owls.glmer1d, control=glmerControl(optimizer='bobyqa'))
## owls.glmer1c <- update(owls.glmer1c, control=glmerControl(optimizer='bobyqa'))
## owls.glmer1a <- update(owls.glmer1a, control=glmerControl(optimizer='bobyqa'))
## owls.glmer1b <- update(owls.glmer1b, control=glmerControl(optimizer='bobyqa'))
AICc(owls.glmer1, owls.glmer2, owls.glmer3, owls.glmer4)
Conclusions:
owls.glmer4a <- update(owls.glmer4, . ~ . + FoodTreatment * SexParent)
owls.glmer4a <- update(owls.glmer4, . ~ . + FoodTreatment * SexParent,
control = glmerControl(optimizer = "bobyqa")
)
owls.glmer4b <- update(owls.glmer4, . ~ . + FoodTreatment + SexParent)
AICc(owls.glmer4a, owls.glmer4b)
anova(owls.glmer4a, owls.glmer4b)
Conclusions:
owls.glmmTMB1 <- glmmTMB(NCalls ~ 1 + offset(log(BroodSize)) + (1 | Nest),
data = owls,
family = poisson(link = "log"),
REML = TRUE
)
owls.glmmTMB2 <- glmmTMB(NCalls ~ 1 + offset(log(BroodSize)) + (FoodTreatment | Nest),
data = owls,
family = poisson(link = "log")
)
owls.glmmTMB3 <- glmmTMB(NCalls ~ 1 + offset(log(BroodSize)) + (SexParent | Nest),
data = owls,
family = poisson(link = "log")
)
owls.glmmTMB4 <- glmmTMB(NCalls ~ 1 + offset(log(BroodSize)) + (FoodTreatment * SexParent | Nest),
data = owls,
family = poisson(link = "log")
)
AICc(owls.glmmTMB1, owls.glmmTMB2, owls.glmmTMB3, owls.glmmTMB4)
## owls.glmmTMB1a <- update(owls.glmmTMB1, REML=TRUE)
## owls.glmmTMB1b <- update(owls.glmmTMB1a, ~ . - (1|Nest) + (FoodTreatment|Nest))
## owls.glmmTMB1c <- update(owls.glmmTMB1a, ~ . - (1|Nest) + (SexParent|Nest))
## owls.glmmTMB1d <- update(owls.glmmTMB1a, ~ . - (1|Nest) + (FoodTreatment*SexParent|Nest))
## AICc(owls.glmmTMB1a, owls.glmmTMB1b, owls.glmmTMB1c, owls.glmmTMB1d)
Conclusions:
owls.glmmTMB4a <- update(owls.glmmTMB4, . ~ . + FoodTreatment * SexParent)
owls.glmmTMB4b <- update(owls.glmmTMB4, . ~ . + FoodTreatment + SexParent)
AICc(owls.glmmTMB4a, owls.glmmTMB4b)
anova(owls.glmmTMB4a, owls.glmmTMB4b)
## owls.glmmTMB1 <- glmmTMB(NCalls ~ FoodTreatment*SexParent + offset(log(BroodSize))
## + (1|Nest), data=owls,
## family=poisson(link='log'), REML=FALSE)
## owls.glmmTMB2 <- update(owls.glmmTMB1, ~ . - FoodTreatment:SexParent)
## AICc(owls.glmmTMB1, owls.glmmTMB2)
## anova(owls.glmmTMB1, owls.glmmTMB2)
Conclusions:
owls.glmer4a %>% plot_model(type = "diag")
## $Nest
Conclusions:
owls.glmer4a %>% performance::check_model()
Conclusions:
It is also possible to predict which modelling family would be the most suitable for the data. This can be attempted via an experimental routine in the performance package that uses random forests to classify a range of possible distributions.
owls.glmer4a %>% performance::check_distribution()
Conclusions:
We can explore over-dispersion more formally:
owls.glmer4a %>% performance::check_overdispersion()
## # Overdispersion test
##
## dispersion ratio = 4.211
## Pearson's Chi-Squared = 2463.718
## p-value = < 0.001
Conclusions:
Over-dispersion could be caused by numerous factors:
We can explore zero-inflation directly:
owls.glmer4a %>% performance::check_zeroinflation()
## # Check for zero-inflation
##
## Observed zeros: 156
## Predicted zeros: 44
## Ratio: 0.28
Conclusions:
owls.resid <- owls.glmer4a %>% simulateResiduals(plot = TRUE, integerResponse = TRUE)
Conclusions:
Perhaps we should specifically explore zero-inflation.
owls.resid %>% testZeroInflation()
##
## DHARMa zero-inflation test via comparison to expected zeros with
## simulation under H0 = fitted model
##
## data: simulationOutput
## ratioObsSim = 2.6233, p-value < 2.2e-16
## alternative hypothesis: two.sided
Conclusions:
The data were collected at various times throughout the night. It is possible that this could lead to patterns of dependency that are not already accounted for. For example, perhaps observations that are collected at similar time of the night (within a given nest) have more similar residuals than those at very different time of the night. We can explore whether there are any temporal autocorrelation patterns.
owls.resid %>% testTemporalAutocorrelation(time = owls$ArrivalTime)
## Error in testTemporalAutocorrelation(., time = owls$ArrivalTime): testing for temporal autocorrelation requires unique time values - if you have several observations per time value, either use the recalculateResiduals function to aggregate residuals per time step, or extract the residuals from the fitted object, and plot / test each of them independently for temporally repeated subgroups (typical choices would be location / subject etc.). Note that the latter must be done by hand, outside testTemporalAutocorrelation.
owls.resid1 <- owls.resid %>% recalculateResiduals(group = interaction(owls$ArrivalTime, owls$Nest), aggregateBy = mean)
owls.resid1 %>% testTemporalAutocorrelation(time = unique(owls$ArrivalTime))
## Error in testTemporalAutocorrelation(., time = unique(owls$ArrivalTime)): Dimensions of time don't match the dimension of the residuals
Conclusions:
owls.glmmTMB4a %>% plot_model(type = "diag")
## $Nest
Conclusions:
owls.glmmTMB4a %>% performance::check_model()
Conclusions:
It is also possible to predict which modelling family would be the most suitable for the data. This can be attempted via an experimental routine in the performance package that uses random forests to classify a range of possible distributions.
owls.glmmTMB4a %>% performance::check_distribution()
Conclusions:
We can explore over-dispersion more formally:
owls.glmmTMB4a %>% performance::check_overdispersion()
## # Overdispersion test
##
## dispersion ratio = 4.211
## Pearson's Chi-Squared = 2463.709
## p-value = < 0.001
Conclusions:
Over-dispersion could be caused by numerous factors:
We can explore zero-inflation directly:
performance::check_zeroinflation(owls.glmmTMB4a)
## # Check for zero-inflation
##
## Observed zeros: 156
## Predicted zeros: 44
## Ratio: 0.28
Conclusions:
owls.resid <- owls.glmmTMB4a %>% simulateResiduals(plot = TRUE, integerResponse = TRUE)
Conclusions:
Perhaps we should specifically explore zero-inflation.
owls.resid %>% testZeroInflation()
##
## DHARMa zero-inflation test via comparison to expected zeros with
## simulation under H0 = fitted model
##
## data: simulationOutput
## ratioObsSim = 2.6264, p-value < 2.2e-16
## alternative hypothesis: two.sided
Conclusions:
The data were collected at various times throughout the night. It is possible that this could lead to patterns of dependency that are not already accounted for. For example, perhaps observations that are collected at similar time of the night (within a given nest) have more similar residuals than those at very different time of the night. We can explore whether there are any temporal autocorrelation patterns.
owls.resid %>% testTemporalAutocorrelation(time = owls$ArrivalTime)
## Error in testTemporalAutocorrelation(., time = owls$ArrivalTime): testing for temporal autocorrelation requires unique time values - if you have several observations per time value, either use the recalculateResiduals function to aggregate residuals per time step, or extract the residuals from the fitted object, and plot / test each of them independently for temporally repeated subgroups (typical choices would be location / subject etc.). Note that the latter must be done by hand, outside testTemporalAutocorrelation.
owls.resid1 <- owls.resid %>% recalculateResiduals(group = interaction(owls$ArrivalTime, owls$Nest), aggregateBy = mean) <-
owls.resid1 %>% testTemporalAutocorrelation(time = unique(owls$ArrivalTime))
## Error in testTemporalAutocorrelation(., time = unique(owls$ArrivalTime)): Dimensions of time don't match the dimension of the residuals
Conclusions:
Conclusions:
glmer(), so we will proceed with glmmTMB() only.Zero-inflated vs hurdle models
owls.glmmTMB5 <- glmmTMB(NCalls ~ FoodTreatment * SexParent + offset(log(BroodSize)) +
(FoodTreatment * SexParent | Nest),
ziformula = ~1, data = owls,
family = poisson(link = "log"),
REML = TRUE
)
# OR
owls.glmmTMB5 <- update(owls.glmmTMB4a, ziformula = ~1)
Model validation
owls.resid <- owls.glmmTMB5 %>% simulateResiduals(plot = TRUE, integerResponse = TRUE)
owls.resid %>% testZeroInflation()
##
## DHARMa zero-inflation test via comparison to expected zeros with
## simulation under H0 = fitted model
##
## data: simulationOutput
## ratioObsSim = 1.0111, p-value = 0.936
## alternative hypothesis: two.sided
owls.resid %>% testDispersion()
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.1602, p-value = 0.216
## alternative hypothesis: two.sided
## owls.glmmTMB5 %>% performance::check_overdispersion()
owls.resid %>% testUniformity()
##
## One-sample Kolmogorov-Smirnov test
##
## data: simulationOutput$scaledResiduals
## D = 0.076864, p-value = 0.001687
## alternative hypothesis: two-sided
owls.resid %>% testQuantiles()
##
## Test for location of quantiles via qgam
##
## data: simulationOutput
## p-value = 0.03375
## alternative hypothesis: both
owls.resid %>% testResiduals()
## $uniformity
##
## One-sample Kolmogorov-Smirnov test
##
## data: simulationOutput$scaledResiduals
## D = 0.076864, p-value = 0.001687
## alternative hypothesis: two-sided
##
##
## $dispersion
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.1602, p-value = 0.216
## alternative hypothesis: two.sided
##
##
## $outliers
##
## DHARMa outlier test based on exact binomial test with approximate
## expectations
##
## data: simulationOutput
## outliers at both margin(s) = 4, observations = 599, p-value = 1
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
## 0.001822384 0.017008907
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 )
## 0.006677796
## $uniformity
##
## One-sample Kolmogorov-Smirnov test
##
## data: simulationOutput$scaledResiduals
## D = 0.076864, p-value = 0.001687
## alternative hypothesis: two-sided
##
##
## $dispersion
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.1602, p-value = 0.216
## alternative hypothesis: two.sided
##
##
## $outliers
##
## DHARMa outlier test based on exact binomial test with approximate
## expectations
##
## data: simulationOutput
## outliers at both margin(s) = 4, observations = 599, p-value = 1
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
## 0.001822384 0.017008907
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 )
## 0.006677796
owls.glmmTMB6 <- glmmTMB(NCalls ~ FoodTreatment * SexParent + offset(log(BroodSize)) +
(FoodTreatment * SexParent | Nest),
ziformula = ~ FoodTreatment * SexParent, data = owls,
family = poisson(link = "log"), REML = TRUE
)
Model validation
owls.resid <- owls.glmmTMB4 %>% simulateResiduals(plot = TRUE, integerResponse = TRUE)
owls.resid %>% testZeroInflation()
##
## DHARMa zero-inflation test via comparison to expected zeros with
## simulation under H0 = fitted model
##
## data: simulationOutput
## ratioObsSim = 4.5003, p-value < 2.2e-16
## alternative hypothesis: two.sided
owls.resid %>% testDispersion()
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.00070878, p-value = 0.008
## alternative hypothesis: two.sided
owls.resid %>% testUniformity()
##
## One-sample Kolmogorov-Smirnov test
##
## data: simulationOutput$scaledResiduals
## D = 0.21799, p-value < 2.2e-16
## alternative hypothesis: two-sided
owls.resid %>% testQuantiles()
## NULL
owls.resid %>% testResiduals()
## $uniformity
##
## One-sample Kolmogorov-Smirnov test
##
## data: simulationOutput$scaledResiduals
## D = 0.21799, p-value < 2.2e-16
## alternative hypothesis: two-sided
##
##
## $dispersion
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.00070878, p-value = 0.008
## alternative hypothesis: two.sided
##
##
## $outliers
##
## DHARMa outlier test based on exact binomial test with approximate
## expectations
##
## data: simulationOutput
## outliers at both margin(s) = 34, observations = 599, p-value < 2.2e-16
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
## 0.03962543 0.07841804
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 )
## 0.05676127
## $uniformity
##
## One-sample Kolmogorov-Smirnov test
##
## data: simulationOutput$scaledResiduals
## D = 0.21799, p-value < 2.2e-16
## alternative hypothesis: two-sided
##
##
## $dispersion
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.00070878, p-value = 0.008
## alternative hypothesis: two.sided
##
##
## $outliers
##
## DHARMa outlier test based on exact binomial test with approximate
## expectations
##
## data: simulationOutput
## outliers at both margin(s) = 34, observations = 599, p-value < 2.2e-16
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
## 0.03962543 0.07841804
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 )
## 0.05676127
owls.glmmTMB7 <- glmmTMB(NCalls ~ FoodTreatment * SexParent + offset(log(BroodSize)) +
(FoodTreatment * SexParent | Nest),
ziformula = ~1, data = owls,
family = nbinom2(link = "log"), REML = TRUE
)
Model validation
owls.resid <- owls.glmmTMB5 %>% simulateResiduals(plot = TRUE, integerResponse = TRUE)
owls.resid %>% testZeroInflation()
##
## DHARMa zero-inflation test via comparison to expected zeros with
## simulation under H0 = fitted model
##
## data: simulationOutput
## ratioObsSim = 1.0111, p-value = 0.936
## alternative hypothesis: two.sided
owls.resid %>% testDispersion()
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.1602, p-value = 0.216
## alternative hypothesis: two.sided
owls.resid %>% testUniformity()
##
## One-sample Kolmogorov-Smirnov test
##
## data: simulationOutput$scaledResiduals
## D = 0.076864, p-value = 0.001687
## alternative hypothesis: two-sided
owls.resid %>% testQuantiles()
##
## Test for location of quantiles via qgam
##
## data: simulationOutput
## p-value = 0.03375
## alternative hypothesis: both
owls.resid %>% testResiduals()
## $uniformity
##
## One-sample Kolmogorov-Smirnov test
##
## data: simulationOutput$scaledResiduals
## D = 0.076864, p-value = 0.001687
## alternative hypothesis: two-sided
##
##
## $dispersion
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.1602, p-value = 0.216
## alternative hypothesis: two.sided
##
##
## $outliers
##
## DHARMa outlier test based on exact binomial test with approximate
## expectations
##
## data: simulationOutput
## outliers at both margin(s) = 4, observations = 599, p-value = 1
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
## 0.001822384 0.017008907
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 )
## 0.006677796
## $uniformity
##
## One-sample Kolmogorov-Smirnov test
##
## data: simulationOutput$scaledResiduals
## D = 0.076864, p-value = 0.001687
## alternative hypothesis: two-sided
##
##
## $dispersion
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.1602, p-value = 0.216
## alternative hypothesis: two.sided
##
##
## $outliers
##
## DHARMa outlier test based on exact binomial test with approximate
## expectations
##
## data: simulationOutput
## outliers at both margin(s) = 4, observations = 599, p-value = 1
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
## 0.001822384 0.017008907
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 )
## 0.006677796
owls.glmmTMB8 <- glmmTMB(NCalls ~ FoodTreatment * SexParent + offset(log(BroodSize)) +
(FoodTreatment + SexParent | Nest),
ziformula = ~ FoodTreatment * SexParent, data = owls,
family = nbinom2(link = "log"), REML = TRUE
)
Model validation
owls.resid <- owls.glmmTMB6 %>% simulateResiduals(plot = TRUE, integerResponse = TRUE)
owls.resid %>% testZeroInflation()
##
## DHARMa zero-inflation test via comparison to expected zeros with
## simulation under H0 = fitted model
##
## data: simulationOutput
## ratioObsSim = 1.0004, p-value = 1
## alternative hypothesis: two.sided
owls.resid %>% testDispersion()
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.36058, p-value = 0.264
## alternative hypothesis: two.sided
owls.resid %>% testUniformity()
##
## One-sample Kolmogorov-Smirnov test
##
## data: simulationOutput$scaledResiduals
## D = 0.064467, p-value = 0.01376
## alternative hypothesis: two-sided
owls.resid %>% testQuantiles()
##
## Test for location of quantiles via qgam
##
## data: simulationOutput
## p-value = 0.01428
## alternative hypothesis: both
testResiduals(owls.resid)
## $uniformity
##
## One-sample Kolmogorov-Smirnov test
##
## data: simulationOutput$scaledResiduals
## D = 0.064467, p-value = 0.01376
## alternative hypothesis: two-sided
##
##
## $dispersion
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.36058, p-value = 0.264
## alternative hypothesis: two.sided
##
##
## $outliers
##
## DHARMa outlier test based on exact binomial test with approximate
## expectations
##
## data: simulationOutput
## outliers at both margin(s) = 6, observations = 599, p-value = 0.4882
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
## 0.003684572 0.021673866
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 )
## 0.01001669
## $uniformity
##
## One-sample Kolmogorov-Smirnov test
##
## data: simulationOutput$scaledResiduals
## D = 0.064467, p-value = 0.01376
## alternative hypothesis: two-sided
##
##
## $dispersion
##
## DHARMa nonparametric dispersion test via sd of residuals fitted vs.
## simulated
##
## data: simulationOutput
## dispersion = 0.36058, p-value = 0.264
## alternative hypothesis: two.sided
##
##
## $outliers
##
## DHARMa outlier test based on exact binomial test with approximate
## expectations
##
## data: simulationOutput
## outliers at both margin(s) = 6, observations = 599, p-value = 0.4882
## alternative hypothesis: true probability of success is not equal to 0.007968127
## 95 percent confidence interval:
## 0.003684572 0.021673866
## sample estimates:
## frequency of outliers (expected: 0.00796812749003984 )
## 0.01001669
AICc(owls.glmmTMB4a, owls.glmmTMB5, owls.glmmTMB6, owls.glmmTMB7, owls.glmmTMB8)
Conclusions:
glmmTMB8 is the ‘best’ model to pursue.owls.glmmTMB8 %>% plot_model(type = "eff", terms = c("FoodTreatment", "SexParent"))
These predictions appear to be based on the mean BroodSize of approximately 4.39. That is, the predictions are for a nest, not per chick. There does not appear to be a way to indicate the offset value.
owls.glmmTMB8 %>%
allEffects() %>%
plot(multiline = TRUE, ci.style = "bars")
These predictions also appear to be based on the mean BroodSize, although the documentation seems to suggest that allEffects() might not deal with the offsets the way we have used them (as a function in the formula) correctly.
owls.glmmTMB8 %>%
ggpredict(terms = c("FoodTreatment", "SexParent")) %>%
plot()
This seems to deal with the offset incorrectly. For the purpose of prediction, the offset seems to be set at the value of the first BroodSize (on the response scale). This is incorrect for two reasons:
ggemmeans() can accommodate the offset correctly. There are two sensible choices:
# off<-owls %>% group_by(SexParent, FoodTreatment) %>% summarize(Mean=mean(BroodSize))
off <- owls %>% summarize(Mean = mean(BroodSize))
as.numeric(off)
## [1] 4.392321
owls.glmmTMB8 %>%
ggemmeans(~ FoodTreatment + SexParent, offset = log(off$Mean)) %>%
plot()
owls.glmmTMB8 %>%
ggemmeans(~ FoodTreatment + SexParent, offset = 0) %>%
plot()
owls.glmmTMB8 %>% summary()
## Family: nbinom2 ( log )
## Formula:
## NCalls ~ FoodTreatment * SexParent + offset(log(BroodSize)) +
## (FoodTreatment + SexParent | Nest)
## Zero inflation: ~FoodTreatment * SexParent
## Data: owls
##
## AIC BIC logLik deviance df.resid
## 3372.7 3438.6 -1671.4 3342.7 588
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev. Corr
## Nest (Intercept) 0.11739 0.3426
## FoodTreatmentSatiated 1.24417 1.1154 0.18
## SexParentMale 0.03551 0.1884 -0.66 -0.72
## Number of obs: 599, groups: Nest, 27
##
## Dispersion parameter for nbinom2 family (): 2.72
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.8367 0.1059 7.905 2.69e-15 ***
## FoodTreatmentSatiated -0.8273 0.2691 -3.074 0.00211 **
## SexParentMale -0.1027 0.1070 -0.959 0.33735
## FoodTreatmentSatiated:SexParentMale 0.1838 0.1742 1.055 0.29160
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Zero-inflation model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.5276 0.2580 -5.921 3.2e-09 ***
## FoodTreatmentSatiated 0.7200 0.3582 2.010 0.0444 *
## SexParentMale -0.9590 0.4056 -2.364 0.0181 *
## FoodTreatmentSatiated:SexParentMale 0.8334 0.5177 1.610 0.1074
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusions:
owls.glmmTMB8 %>% tidy(conf.int = TRUE)
## or on the response scale
owls.glmmTMB8 %>% tidy(conf.int = TRUE, exponentiate = TRUE)
owls.glmmTMB8 %>%
tidy(conf.int = TRUE, exponentiate = TRUE) %>%
kable()
| effect | component | group | term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|---|---|---|
| fixed | cond | NA | (Intercept) | 2.3088294 | 0.2443986 | 7.9046753 | 0.0000000 | 1.8762420 | 2.8411544 |
| fixed | cond | NA | FoodTreatmentSatiated | 0.4372408 | 0.1176684 | -3.0740337 | 0.0021119 | 0.2580173 | 0.7409562 |
| fixed | cond | NA | SexParentMale | 0.9023997 | 0.0965938 | -0.9594244 | 0.3373450 | 0.7316198 | 1.1130442 |
| fixed | cond | NA | FoodTreatmentSatiated:SexParentMale | 1.2017248 | 0.2093898 | 1.0546189 | 0.2915996 | 0.8540648 | 1.6909052 |
| fixed | zi | NA | (Intercept) | 0.2170499 | 0.0559987 | -5.9210570 | 0.0000000 | 0.1309032 | 0.3598893 |
| fixed | zi | NA | FoodTreatmentSatiated | 2.0545162 | 0.7358288 | 2.0104333 | 0.0443853 | 1.0182400 | 4.1454243 |
| fixed | zi | NA | SexParentMale | 0.3832871 | 0.1554650 | -2.3642703 | 0.0180656 | 0.1730887 | 0.8487500 |
| fixed | zi | NA | FoodTreatmentSatiated:SexParentMale | 2.3010863 | 1.1912373 | 1.6098239 | 0.1074363 | 0.8342166 | 6.3472700 |
| ran_pars | cond | Nest | sd__(Intercept) | 0.3426284 | NA | NA | NA | 0.1974161 | 0.5946535 |
| ran_pars | cond | Nest | sd__FoodTreatmentSatiated | 1.1154252 | NA | NA | NA | 0.7174586 | 1.7341397 |
| ran_pars | cond | Nest | sd__SexParentMale | 0.1884325 | NA | NA | NA | 0.0413314 | 0.8590753 |
| ran_pars | cond | Nest | cor__(Intercept).FoodTreatmentSatiated | 0.1809264 | NA | NA | NA | -0.4469640 | 0.6553227 |
| ran_pars | cond | Nest | cor__(Intercept).SexParentMale | -0.6557004 | NA | NA | NA | -0.6789738 | 0.6474187 |
| ran_pars | cond | Nest | cor__FoodTreatmentSatiated.SexParentMale | -0.7164181 | NA | NA | NA | -0.3487782 | 0.9915544 |
Conclusions:
# warning this is only appropriate for html output
owls.glmmTMB8 %>% sjPlot::tab_model(show.se = TRUE, show.aic = TRUE)
| NCalls | |||||
|---|---|---|---|---|---|
| Predictors | Incidence Rate Ratios | std. Error | CI | p | |
| Count Model | |||||
| (Intercept) | 2.31 | 0.24 | 1.88 – 2.84 | <0.001 | |
| FoodTreatment [Satiated] | 0.44 | 0.12 | 0.26 – 0.74 | 0.002 | |
| SexParent [Male] | 0.90 | 0.10 | 0.73 – 1.11 | 0.337 | |
|
FoodTreatment [Satiated] * SexParent [Male] |
1.20 | 0.21 | 0.85 – 1.69 | 0.292 | |
| (Intercept) | 2.72 | ||||
| Zero-Inflated Model | |||||
| (Intercept) | 0.22 | 0.06 | 0.13 – 0.36 | <0.001 | |
| FoodTreatment [Satiated] | 2.05 | 0.74 | 1.02 – 4.15 | 0.044 | |
| SexParent [Male] | 0.38 | 0.16 | 0.17 – 0.85 | 0.018 | |
|
FoodTreatment [Satiated] * SexParent [Male] |
2.30 | 1.19 | 0.83 – 6.35 | 0.107 | |
| Random Effects | |||||
| σ2 | 0.50 | ||||
| τ00 Nest | 0.12 | ||||
| τ11 Nest.FoodTreatmentSatiated | 1.24 | ||||
| τ11 Nest.SexParentMale | 0.04 | ||||
| ρ01 | 0.18 | ||||
| -0.66 | |||||
| ICC | 0.56 | ||||
| N Nest | 27 | ||||
| Observations | 599 | ||||
| Marginal R2 / Conditional R2 | 0.102 / 0.609 | ||||
| AIC | 3372.702 | ||||
options(width = 100)
owls.glmmTMB8 %>% tidy(conf.int = TRUE)
plogis(-1.53)
exp(-1.53)
owls.glmmTMB8 %>% tidy(effects = "fixed", conf.int = TRUE, exponentiate = TRUE)
## owls.glmmTMB8 %>% r.squaredGLMM()
owls.glmmTMB8 %>% performance::r2_nakagawa()
## # R2 for Mixed Models
##
## Conditional R2: 0.609
## Marginal R2: 0.102
## owls.glmmTMB8 %>% performance::r2_zeroinflated()
newdata <- owls.glmmTMB8 %>%
emmeans(~ FoodTreatment + SexParent,
offset = 0, type = "response"
) %>%
as.data.frame()
head(newdata)
ggplot(newdata, aes(y = response, x = FoodTreatment)) +
geom_pointrange(aes(ymin = lower.CL, ymax = upper.CL, color = SexParent),
position = position_dodge(width = 0.2)
) +
scale_y_continuous("Number of sibling negotiations per chick") +
theme_bw()
## OR if we want to express this for the average brood size
newdata <- owls.glmmTMB8 %>%
emmeans(~ FoodTreatment + SexParent,
offset = log(mean(owls$BroodSize)), type = "response"
) %>%
as.data.frame()
head(newdata)
ggplot(newdata, aes(y = response, x = FoodTreatment)) +
geom_pointrange(aes(ymin = lower.CL, ymax = upper.CL, color = SexParent),
position = position_dodge(width = 0.2)
) +
scale_y_continuous("Number of sibling negotiations per nest") +
theme_bw()
newdata <- tidy(owls.glmmTMB5, effects = "fixed", conf.int = TRUE, exponentiate = TRUE) %>%
mutate(Model = "zip (simple zi)") %>%
bind_rows(
tidy(owls.glmmTMB6, effects = "fixed", conf.int = TRUE, exponentiate = TRUE) %>%
mutate(Model = "zip (complex zi)")
) %>%
bind_rows(
tidy(owls.glmmTMB7, effects = "fixed", conf.int = TRUE, exponentiate = TRUE) %>%
mutate(Model = "zinb (simple zi)")
) %>%
bind_rows(
tidy(owls.glmmTMB8, effects = "fixed", conf.int = TRUE, exponentiate = TRUE) %>%
mutate(Model = "zinb (complex zi)")
) %>%
mutate(
Model = factor(Model, levels = c(
"zip (simple zi)", "zip (complex zi)",
"zinb (simple zi)", "zinb (complex zi)"
)),
Cond = interaction(component, term)
) %>%
arrange(component, term) %>%
mutate(Cond = factor(Cond,
levels = rev(unique(Cond))
))
ggplot(newdata, aes(y = estimate, x = Cond, color = Model)) +
geom_pointrange(aes(ymin = conf.low, ymax = conf.high), position = position_dodge(width = 0.2)) +
coord_flip()
newdata <- emmeans(owls.glmmTMB5, ~ FoodTreatment + SexParent, offset = 0, type = "response") %>%
as.data.frame() %>%
mutate(Model = "zip (simple zi)", response = rate) %>%
bind_rows(
emmeans(owls.glmmTMB6, ~ FoodTreatment + SexParent, offset = 0, type = "response") %>%
as.data.frame() %>% mutate(Model = "zip (complex zi)", response = rate)
) %>%
bind_rows(
emmeans(owls.glmmTMB7, ~ FoodTreatment + SexParent, offset = 0, type = "response") %>%
as.data.frame() %>% mutate(Model = "zinb (simple zi)", response = response)
) %>%
bind_rows(
emmeans(owls.glmmTMB8, ~ FoodTreatment + SexParent, offset = 0, type = "response") %>%
as.data.frame() %>% mutate(Model = "zinb (complex zi)", response = response)
) %>%
mutate(Model = factor(Model, levels = c(
"zip (simple zi)", "zip (complex zi)",
"zinb (simple zi)", "zinb (complex zi)"
)))
head(newdata)
ggplot(newdata, aes(y = response, x = FoodTreatment)) +
geom_pointrange(aes(color = SexParent, ymin = lower.CL, ymax = upper.CL),
position = position_dodge(width = 0.2)
) +
facet_wrap(~Model, nrow = 1)
ggplot(newdata, aes(y = response, x = interaction(FoodTreatment, SexParent), color = Model)) +
geom_pointrange(aes(ymin = lower.CL, ymax = upper.CL), position = position_dodge(width = 0.2)) +
coord_flip()